Skip to content

Various fixes#78

Merged
clemsgrs merged 31 commits intomainfrom
codex/various-fixes
Mar 23, 2026
Merged

Various fixes#78
clemsgrs merged 31 commits intomainfrom
codex/various-fixes

Conversation

@clemsgrs
Copy link
Owner

@clemsgrs clemsgrs commented Mar 18, 2026

Summary

  • add tile-store-aware embedding paths plus on-the-fly slide reading backends, including cuCIM supertiles, WSD tile reading, backend auto-resolution, and separate preprocessing vs embedding worker controls
  • add benchmark coverage and tooling for embedding throughput, tile read strategies, and end-to-end path comparisons, along with progress/timing instrumentation for loader, reader, preprocessing, and forward-pass breakdowns
  • tighten pretrained model handling: validate recommended input size, spacing, and precision by default, add allow_non_recommended_settings as an explicit opt-out, support CONCHv1.5, align MUSK/Virchow defaults with upstream behavior, and clean up model preset docs/configs

@clemsgrs clemsgrs changed the title [codex] various-fixes Various fixes Mar 18, 2026
@clemsgrs clemsgrs marked this pull request as ready for review March 18, 2026 23:41
@clemsgrs clemsgrs force-pushed the codex/various-fixes branch from 60822b9 to 01e8bdf Compare March 20, 2026 14:03
clemsgrs and others added 21 commits March 20, 2026 19:23
Read tiles directly from WSI during embedding via cucim's batched
read_region, eliminating the tar creation step. Super tiles (8x8, 4x4,
2x2 blocks) reduce the number of read calls. A custom batch sampler
(adaptive_batching) optionally aligns DataLoader batches to super tile
boundaries to avoid redundant reads.

New config options under tiling:
- on_the_fly (default true): read from WSI instead of tar
- gpu_decode (default false): experimental GPU JPEG decoding
- adaptive_batching (default false): vary batch size to match super tiles

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- remove dead try/except around gpu_decode dict assignment
- remove unused variable in read_batch
- update docstring to mention 2x2 blocks
- fix on_the_fly default mismatch in from_config fallback (False → True)
- use np.isin instead of set for DDP tile filtering

Co-Authored-By: Claude Opus 4.6 <[email protected]>
The flat arrays (tile_to_st, tile_crop_x, tile_crop_y) are the only
lookup structures used in the hot path. The per-super-tile duplicates
were never read.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
PyTurboJPEG>=2 requires libjpeg-turbo 3.0+. Ubuntu 22.04 ships 2.x via
libturbojpeg0-dev. Build libjpeg-turbo 3.1.0 from source in the build
stage and copy the shared libs to the runtime stage.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Pulls in hs2p[cucim] (cucim-cu12, cupy-cuda12x, nvidia-nvimgcodec-cu12)
and PyTurboJPEG.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
_WSDTarReadPlan now carries tile_indices directly, so _build_supertile_index
no longer needs to reconstruct member coords via coord_to_index — replaced
with iter(plan.tile_indices). Also bumps hs2p requirement to >=2.4.1 which
introduced the top-down super tile grouping and the tile_indices field.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Threads hs2p's jpeg_backend parameter (turbojpeg/pil) through
PreprocessingConfig → _tile_slides so callers can pick the JPEG
encoder used during tar extraction.  Defaults to "turbojpeg".

Sets jpeg_backend="pil" in test_output_consistency so the tar
path matches the PIL-encoded ground truth fixtures.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Adds num_cucim_workers (default 4) to PreprocessingConfig. In the
on-the-fly path, DataLoader num_workers is auto-derived as
cpu_count // num_cucim_workers instead of reusing speed.num_workers,
avoiding oversubscription when both are large. A log message is emitted
when the computed value differs from speed.num_workers.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Adds num_preprocessing_workers to ExecutionOptions, read from
speed.num_preprocessing_workers (fallback speed.num_workers).
_tile_slides now uses num_preprocessing_workers for hs2p, while
execution.num_workers (speed.num_dataloader_workers) remains the
DataLoader knob for the tar embedding path.

Default yaml updated with explicit num_preprocessing_workers and
num_dataloader_workers keys. Backward-compatible via num_workers
fallback in from_config.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…eading paths

The batch preprocessor was applying a bilinear resize to target_tile_size_px
before model transforms, while the old TileDataset path used PIL Lanczos and
the model's own Resize transform. For the test WSI (read_tile_size=444px,
target=224px) this produced noticeably different pixel values.

Fix: defer resizing to the model's own Resize transform when one exists,
extracting its interpolation mode (e.g. BICUBIC for Virchow) into
BatchTransformSpec.resize_interpolation and applying it via F.interpolate.
Only fall back to a bilinear pre-resize when the model has no Resize step.

The CI consistency test is updated to always run on_the_fly=True (direct WSI
reads, no JPEG round-trip) and drops jpeg_backend which is tar-path-only.
GT embedding fixture regenerated with the new consistent preprocessing.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@clemsgrs clemsgrs merged commit 8c2be1e into main Mar 23, 2026
2 checks passed
@clemsgrs clemsgrs deleted the codex/various-fixes branch March 23, 2026 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant